Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
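To make the additive structure concrete, here is a minimal sketch of a toy series built as a linear trend plus a weekly seasonal term written as a short Fourier series. It is not Prophet's implementation: the function names, the coefficients, and the choice of a plain linear trend are assumptions made for illustration only.

```python
import math

def fourier_features(t, period, order):
    """Return [sin, cos] pairs for harmonics 1..order at time t (in days)."""
    feats = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

def additive_model(t, slope, intercept, beta, period=7.0):
    """y(t) = trend(t) + seasonality(t); beta holds one weight per Fourier feature."""
    trend = intercept + slope * t
    order = len(beta) // 2
    seasonal = sum(b * f for b, f in zip(beta, fourier_features(t, period, order)))
    return trend + seasonal

# Hypothetical coefficients, chosen only for illustration.
beta = [0.5, -0.2, 0.1, 0.05]  # two harmonics -> four features
series = [additive_model(t, slope=0.01, intercept=8.0, beta=beta) for t in range(14)]

# The seasonal term repeats every 7 days, so values one week apart
# differ only by the trend contribution, slope * 7.
print(round(series[7] - series[0], 6))
```

Because the components are summed, each one can be inspected on its own, which is what the component plots later in this notebook rely on.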
Prophet is open source software released by Facebook’s Core Data Science team. It is available for download on CRAN and PyPI.
Prophet is used in many applications across Facebook for producing reliable forecasts for planning and goal setting. We’ve found it to perform better than any other approach in the majority of cases. We fit models in Stan so that you get forecasts in just a few seconds.
Get a reasonable forecast on messy data with no manual effort. Prophet is robust to outliers, missing data, and dramatic changes in your time series.
The Prophet procedure includes many possibilities for users to tweak and adjust forecasts. You can use human-interpretable parameters to improve your forecast by adding your domain knowledge.
We’ve implemented the Prophet procedure in R and Python, but they share the same underlying Stan code for fitting. Use whatever language you’re comfortable with to get forecasts.
# Install Prophet (keep the comment on its own line; pip rejects a trailing '#')
!pip install prophet
# Python
import pandas as pd
from prophet import Prophet
# Python
df = pd.read_csv(r'C:\Users\Hp\Downloads\FbprophetN - Sheet1.csv')
df.head()
| | ds | y |
|---|---|---|
| 0 | 2007-12-10 | 9.590761 |
| 1 | 2007-12-11 | 8.519590 |
| 2 | 2007-12-12 | 8.183677 |
| 3 | 2007-12-13 | 8.072467 |
| 4 | 2007-12-14 | 7.893572 |
df.shape
(2905, 2)
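Prophet expects the input frame to have a date column named `ds` and a numeric column named `y`. A small check before fitting can catch a misnamed or mistyped column early. The sketch below is one possible way to do this with pandas; the helper name `check_prophet_input` is hypothetical and not part of Prophet.

```python
import pandas as pd

def check_prophet_input(df):
    """Hypothetical helper: return a copy with 'ds' parsed as dates and 'y' as numbers."""
    missing = {"ds", "y"} - set(df.columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    out = df.copy()
    out["ds"] = pd.to_datetime(out["ds"])  # raises if a value is not a valid date
    out["y"] = pd.to_numeric(out["y"])     # raises if a value is not numeric
    return out

# The first rows shown above, typed in by hand as strings.
sample = pd.DataFrame({
    "ds": ["2007-12-10", "2007-12-11", "2007-12-12"],
    "y": ["9.590761", "8.519590", "8.183677"],
})
clean = check_prophet_input(sample)
print(clean.dtypes)
```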
# Python
m = Prophet()
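Parameters of the `Prophet` constructor:
- `growth`: String 'linear', 'logistic' or 'flat' to specify a linear, logistic or flat trend.
- `changepoints`: List of dates at which to include potential changepoints. If not specified, potential changepoints are selected automatically.
- `n_changepoints`: Number of potential changepoints to include. Not used if input `changepoints` is supplied. If `changepoints` is not supplied, then `n_changepoints` potential changepoints are selected uniformly from the first `changepoint_range` proportion of the history.
- `changepoint_range`: Proportion of history in which trend changepoints will be estimated. Defaults to 0.8 for the first 80%. Not used if `changepoints` is specified.
- `yearly_seasonality`: Fit yearly seasonality. Can be 'auto', True, False, or a number of Fourier terms to generate.
- `weekly_seasonality`: Fit weekly seasonality. Can be 'auto', True, False, or a number of Fourier terms to generate.
- `daily_seasonality`: Fit daily seasonality. Can be 'auto', True, False, or a number of Fourier terms to generate.
- `holidays`: pd.DataFrame with columns `holiday` (string) and `ds` (date type) and optionally columns `lower_window` and `upper_window` which specify a range of days around the date to be included as holidays. `lower_window=-2` will include 2 days prior to the date as holidays. Also optionally can have a column `prior_scale` specifying the prior scale for that holiday.
- `seasonality_mode`: 'additive' (default) or 'multiplicative'.
- `seasonality_prior_scale`: Parameter modulating the strength of the seasonality model. Larger values allow the model to fit larger seasonal fluctuations, smaller values dampen the seasonality. Can be specified for individual seasonalities using `add_seasonality`.
- `holidays_prior_scale`: Parameter modulating the strength of the holiday components model, unless overridden in the holidays input.
- `changepoint_prior_scale`: Parameter modulating the flexibility of the automatic changepoint selection. Large values will allow many changepoints, small values will allow few changepoints.
- `mcmc_samples`: Integer. If greater than 0, will do full Bayesian inference with the specified number of MCMC samples. If 0, will do MAP estimation.
- `interval_width`: Float, width of the uncertainty intervals provided for the forecast. If `mcmc_samples=0`, this will be only the uncertainty in the trend using the MAP estimate of the extrapolated generative model. If `mcmc_samples>0`, this will be integrated over all model parameters, which will include uncertainty in seasonality.
- `uncertainty_samples`: Number of simulated draws used to estimate uncertainty intervals. Setting this value to 0 or False will disable uncertainty estimation and speed up the calculation.
- `stan_backend`: str as defined in StanBackendEnum. Defaults to None, in which case Prophet will try to iterate over all available backends and find the working one.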
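As a rough illustration of how `n_changepoints` and `changepoint_range` interact, the sketch below places candidate changepoints at evenly spaced row positions within the first part of the history. It follows the description above rather than reproducing Prophet's internals. The function name is hypothetical, and the default of 25 changepoints is my understanding of the library default, not something stated in this notebook.

```python
import numpy as np

def candidate_changepoint_indexes(n_obs, n_changepoints=25, changepoint_range=0.8):
    """Evenly spaced row positions within the first `changepoint_range` of the history."""
    hist_size = int(np.floor(n_obs * changepoint_range))
    # Take n_changepoints + 1 evenly spaced positions and drop the first one (row 0),
    # so that exactly n_changepoints candidates remain.
    idx = np.linspace(0, hist_size - 1, n_changepoints + 1).round().astype(int)
    return idx[1:]

idx = candidate_changepoint_indexes(2905)  # 2905 rows, as reported by df.shape
print(len(idx), idx[0], idx[-1])
```

No candidate falls in the last 20% of the rows, which keeps the trend from being bent by the most recent observations alone.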
m.fit(df)
11:33:02 - cmdstanpy - INFO - Chain [1] start processing
11:33:03 - cmdstanpy - INFO - Chain [1] done processing
<prophet.forecaster.Prophet at 0x1f722199940>
Fitting sets `self.params` to contain the fitted model parameters. It is a dictionary with parameter names as keys and the following items:
- `k` (Mx1 array): M posterior samples of the initial slope.
- `m` (Mx1 array): The initial intercept.
- `delta` (MxN array): The slope change at each of N changepoints.
- `beta` (MxK matrix): Coefficients for K seasonality features.
- `sigma_obs` (Mx1 array): Noise level.
Note that M=1 if MAP estimation is used.
Parameters of `fit`:
- `df`: pd.DataFrame containing the history. Must have columns `ds` (date type) and `y`, the time series. If `self.growth` is 'logistic', then `df` must also have a column `cap` that specifies the capacity at each `ds`.
- `kwargs`: Additional arguments passed to the optimizing or sampling functions in Stan.
Returns: the fitted Prophet object.
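The `cap` requirement for logistic growth can be met by adding a column before fitting. The following is a minimal sketch that assumes a constant capacity. The helper name `with_capacity` and the value 10.0 are illustrative assumptions, not results from this dataset.

```python
import pandas as pd

def with_capacity(df, cap):
    """Hypothetical helper: return a copy of df with a constant 'cap' column."""
    out = df.copy()
    out["cap"] = cap
    return out

history = pd.DataFrame({
    "ds": pd.to_datetime(["2007-12-10", "2007-12-11", "2007-12-12"]),
    "y": [9.590761, 8.519590, 8.183677],
})
history_capped = with_capacity(history, cap=10.0)  # 10.0 is an assumed ceiling
print(history_capped)

# With Prophet installed, the capped frame would be used like this:
#   m_logistic = Prophet(growth="logistic")
#   m_logistic.fit(history_capped)
# The frame passed to predict() needs a 'cap' column as well.
```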
# Python
future = m.make_future_dataframe(periods=365)
Signature: `m.make_future_dataframe(periods, freq='D', include_history=True)`
Docstring: Simulate the trend using the extrapolated generative model.
Parameters:
- `periods`: Int number of periods to forecast forward.
- `freq`: Any valid frequency for `pd.date_range`, such as 'D' or 'M'.
- `include_history`: Boolean to include the historical dates in the data frame for predictions.
Returns: pd.DataFrame that extends forward from the end of `self.history` for the requested number of periods.
future.tail()
| | ds |
|---|---|
| 3265 | 2017-01-15 |
| 3266 | 2017-01-16 |
| 3267 | 2017-01-17 |
| 3268 | 2017-01-18 |
| 3269 | 2017-01-19 |
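The row labels in this table follow from the sizes involved: 2905 historical rows plus 365 future days give 3270 rows, labelled 0 to 3269. The sketch below approximates what `make_future_dataframe` does using pandas only. It is an illustration, not the library's code; the function name `make_future_frame` and the ten-day toy history are assumptions.

```python
import pandas as pd

def make_future_frame(history_dates, periods, freq="D", include_history=True):
    """Hypothetical stand-in: extend a date column forward by `periods` steps."""
    history = pd.Series(pd.to_datetime(history_dates)).sort_values().reset_index(drop=True)
    last_date = history.max()
    # Generate one extra date, then keep only dates strictly after the last observation.
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    dates = dates[dates > last_date][:periods]
    new = pd.Series(dates)
    if include_history:
        new = pd.concat([history, new], ignore_index=True)
    return pd.DataFrame({"ds": new})

history = pd.date_range("2024-01-01", periods=10, freq="D")  # toy history
future = make_future_frame(history, periods=5)
print(len(future), future["ds"].iloc[-1])
```

Setting `include_history=False` returns only the new dates, which is useful when in-sample fitted values are not needed.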
# Python
forecast = m.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
| | ds | yhat | yhat_lower | yhat_upper |
|---|---|---|---|---|
| 3265 | 2017-01-15 | 8.209884 | 7.445995 | 8.902751 |
| 3266 | 2017-01-16 | 8.534913 | 7.909519 | 9.251536 |
| 3267 | 2017-01-17 | 8.322344 | 7.615603 | 9.074805 |
| 3268 | 2017-01-18 | 8.155006 | 7.429595 | 8.922203 |
| 3269 | 2017-01-19 | 8.166945 | 7.401706 | 8.886312 |
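Here `yhat` is the point forecast and `yhat_lower` and `yhat_upper` bound its uncertainty interval. A quick sanity check is to confirm that each point forecast lies inside its interval and to look at how wide the interval is. The sketch below does this for the five rows printed above, copied by hand; the helper name `interval_report` is hypothetical.

```python
# Rows copied from the forecast table above: (ds, yhat, yhat_lower, yhat_upper).
rows = [
    ("2017-01-15", 8.209884, 7.445995, 8.902751),
    ("2017-01-16", 8.534913, 7.909519, 9.251536),
    ("2017-01-17", 8.322344, 7.615603, 9.074805),
    ("2017-01-18", 8.155006, 7.429595, 8.922203),
    ("2017-01-19", 8.166945, 7.401706, 8.886312),
]

def interval_report(rows):
    """Return (ds, width, inside) per row, where inside means lower <= yhat <= upper."""
    report = []
    for ds, yhat, lower, upper in rows:
        report.append((ds, round(upper - lower, 6), lower <= yhat <= upper))
    return report

report = interval_report(rows)
for ds, width, inside in report:
    print(ds, width, inside)
```

On the full `forecast` frame the same check can be written with column arithmetic, for example `forecast["yhat_upper"] - forecast["yhat_lower"]`.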
# Python
fig1 = m.plot(forecast)
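Selected optional arguments of `m.plot`:
- `plot_cap`: Optional boolean indicating if the capacity should be shown in the figure, if available.
- `xlabel`: Optional label name on X-axis.
- `ylabel`: Optional label name on Y-axis.
- `figsize`: Optional tuple width, height in inches.
- `include_legend`: Optional boolean to add legend to the plot.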
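These plotting options can be bundled in a small wrapper so that every figure in a report uses the same labels and size. The sketch below assumes the keyword names `xlabel`, `ylabel`, `figsize` and `include_legend` accepted by `Prophet.plot`. The wrapper name and its default values are hypothetical, and it only forwards arguments to whatever `plot` method the model object exposes.

```python
def plot_with_labels(model, forecast, xlabel="Date", ylabel="y", figsize=(12, 5)):
    """Hypothetical wrapper: forward labelling options to model.plot."""
    return model.plot(
        forecast,
        xlabel=xlabel,
        ylabel=ylabel,
        figsize=figsize,
        include_legend=True,
    )

# With the fitted model and forecast from this notebook:
#   fig = plot_with_labels(m, forecast, ylabel="y (units of the input series)")
```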
# Python
fig2 = m.plot_components(forecast)
# Python
from prophet.plot import plot_plotly, plot_components_plotly
plot_plotly(m, forecast)
# Python
plot_components_plotly(m, forecast)